Word Midas Powered by StringNet: Discovering Lexicogrammatical Constructions in Situ

نویسندگان

  • David Wible
  • Nai-Lung Tsao
چکیده

Adult second language learners face the daunting but underappreciated task of mastering patterns of language use that are neither products of fully productive grammar rules nor frozen items to be memorized. Word Midas, a web browser extension, targets this uncharted territory of lexicogrammar by detecting multiword tokens of lexicogrammatical patterning in real time in situ within the noisy digital texts from the user’s unscripted web browsing or other digital venues. The language model powering Word Midas is StringNet, a densely cross-indexed navigable network of one billion lexicogrammatical patterns of English. These resources are described and their functionality is illustrated with a detailed scenario.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

StringNet as a Computational Resource for Discovering and Investigating Linguistic Constructions

We describe and motivate the design of a lexico-grammatical knowledgebase called StringNet and illustrate its significance for research into constructional phenomena in English. StringNet consists of a massive archive of what we call hybrid n-grams. Unlike traditional n-grams, hybrid n-grams can consist of any co-occurring combination of POS tags, lexemes, and specific word forms. Further, we d...

متن کامل

Word similarity using constructions as contextual features

1 We propose and implement an alternative source of contextual features for word similarity detection based on the notion of lexicogrammatical construction. On the assumption that selectional restrictions provide indicators of the semantic similarity of words attested in selected positions, we extend the notion of selection beyond that of single selecting heads to multiword constructions exerti...

متن کامل

The StringNet Lexico-Grammatical Knowledgebase and its Applications

This demo introduces a suite of web-based English lexical knowledge resources, called StringNet and StringNet Navigator (http://nav.stringnet.org), designed to provide access to the immense territory of multiword expressions that falls between what the lexical entries encode in lexicons on the one hand and what productive grammar rules cover on the other. StringNet’s content consists of 1.6 bil...

متن کامل

Discovering Light Verb Constructions and their Translations from Parallel Corpora without Word Alignment

We propose a method for joint unsupervised discovery of multiword expressions (MWEs) and their translations from parallel corpora. First, we apply independent monolingual MWE extraction in source and target languages simultaneously. Then, we calculate translation probability, association score and distributional similarity of co-occurring pairs. Finally, we rank all translations of a given MWE ...

متن کامل

Discovering and Analyzing the Intellectual Structure and Its Evolution in Core Journals of "Knowledge and Information Science" during 2004-2013

Purpose: This study aims to reveal the intellectual structure of Knowledge and Information Science and its evolution along with the review of journals subjective scope based on 6830 abstract in the ten core journal in the JCR 2013, over the ten years (2004-2013). Methodology: In this research, co-word and Correspondence analysis of 150 words -selected by tf-idf weight- were done after parametri...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016